Representation of k-Mer Sets Using Spectrum-Preserving String Sets
Similar resources
Compact Universal k-mer Hitting Sets
We address the problem of finding a minimum-size set of k-mers that hits L-long sequences. The problem arises in the design of compact hash functions and other data structures for efficient handling of large sequencing datasets. We prove that the problem of hitting a given set of L-long sequences is NP-hard and give a heuristic solution that finds a compact universal k-mer set that hits any set...
GaKCo: A Fast Gapped k-mer String Kernel Using Counting
String Kernel (SK) techniques, especially those using gapped k-mers as features (gk), have obtained great success in classifying sequences like DNA, protein, and text. However, the state-of-the-art gk-SK runs extremely slow when we increase the dictionary size (Σ) or allow more mismatches (M). This is because current gk-SK uses a trie-based algorithm to calculate cooccurrence of mismatched subs...
Privacy Preserving in Clustering using Fuzzy Sets
Data mining techniques, in spite of their benefit in a wide range of applications have also raised threat to privacy and data security. This paper addresses the problem of preserving privacy of individuals when data is shared. Sharing the entire data not only provides irrelevant information to the miner but also makes the data vulnerable to privacy violation. Dimensionality of data is reduced a...
k-Sets and k-Facets
We survey problems, results, and methods concerning k-facets and k-sets of finite point sets in real affine d-space and the dual notion of levels in arrangements of hyperplanes.
Ela Matrix Functions Preserving Sets
Matrix functions preserving several sets of generalized nonnegative matrices are characterized. These sets include PFn, the set of n×n real eventually positive matrices; and WPFn, the set of matrices A ∈ R^{n×n} such that A and its transpose have the Perron-Frobenius property. Necessary conditions and sufficient conditions for a matrix function to preserve the set of n×n real eventually nonnegative ...
Journal
Journal title: Journal of Computational Biology
Year: 2021
ISSN: 1557-8666
DOI: 10.1089/cmb.2020.0431